ABSTRACT

When a frequently-accessed cache item expires, multiple requests
to that item can trigger a cache miss and start regenerating
that same item at the same time. This phenomenon,
known as cache stampede, severely limits the performance
of databases and web servers. A natural countermeasure to
this issue is to let the processes that perform such requests
randomly ask for a regeneration before the expiration
time of the item. In this paper we give optimal algorithms
for performing such probabilistic early expirations. Our algorithms
are theoretically optimal and perform much better
than other solutions used in real-world applications.

1.  INTRODUCTION 

The  cache  stampede  problem  (also  called  dog-piling,  cache 
miss  storm,  or  cache  choking)  is  a  situation  that  occurs  when 
a  popular  cache  item  expires,  leading  to  multiple  requests 
seeing  a  cache  miss  and  regenerating  that  same  item  at  the 
same  time. 

This issue is a consequence of the typical pattern used for
dealing with a cache miss (see Figure 1): a cache item request
checks whether the item is cached, and regenerates the
item if it is not present. The flaw with this approach is that
many different requests may see that cache miss at the same
time and regenerate the item in cache through a potentially
expensive computation. The number of requests seeing the
cache miss (and therefore contributing to the stampede) depends
not only on the request rate, but also on the time
needed to recompute the item. For example, if the cache
item is accessed 10 times per second, and the recomputation
of the item takes 3 seconds, then 30 requests will recompute
the item.

Having several re-computations of the same cache item,
apart from being wasteful, often leads to overloading of the
system or slow-down of the database, which is the very
outcome that caching the item was meant to avoid. A cache
stampede is often referred to as a cascading failure because
several concurrent re-computations will increase the time of
each individual recomputation by bogging down the system,
in turn causing even more requests to be part of the stampede.

Due  to  the  widespread  nature  and  the  threats  associated 
with  this  problem,  multiple  approaches  have  been  proposed 
to  mitigate  cache  stampedes. 

• External re-computation: Rather than having the requests
themselves regenerate the cache item upon cache expiration,
have a separate background process periodically
regenerate the item. This solution prevents
cache stampedes altogether, but it is often discarded
because of the burden of maintaining an external process
(need to enforce availability of this daemon process,
need for monitoring, code separation/repetition,
etc.). This becomes even more daunting when dealing
with multiple cache items, as it involves keeping track
of which items to periodically re-compute and when.
In fact, the background process would even regenerate
cache items that were never requested (which can be,
in some settings, a waste of processing time).

• Locking: Upon a cache miss, a request attempts to
acquire  a  lock  for  that  cache  key,  and  regenerates  the 
item  only  if  it  acquires  it.  Depending  on  how  the  lock 
mechanism  is  implemented  (atomically  or  not),  this 
approach  may  mitigate  cache  stampedes  or  completely 
prevent  them.  One  issue  with  this  approach  is  that 
all  requests  that  would  have  been  part  of  a  stampede 
(apart  from  the  one  acquiring  the  lock)  have  no  cache 
item  to  return.  There  are  a  few  options:  have  the 
client  handle  the  absence  of  the  item  properly;  have 
the  requests  not  acquiring  the  lock  wait  for  the  item 
to  be  regenerated;  or,  keep  a  stale  item  in  the  cache 
to  be  used  while  the  new  value  is  generated.  Apart 
from  this  issue,  this  approach  requires  one  extra  write 
for  the  locking  mechanism  (doubling  the  number  of 
write  operations),  tuning  a  time-to-live  for  the  lock 
itself  (high  enough  to  recompute  the  item,  but  less 
than the re-computation frequency), and implementing
a locking mechanism. Finally, this approach is not
fault-tolerant: if the request acquiring the lock fails
while re-generating the item, no item or a stale item
will be served until the lock expires and a new lock is
acquired.

• Probabilistic early expiration: Each individual request
may regenerate the item in cache before its expiration
by making an independent probabilistic decision.
The probability of performing an early expiration increases
as the request time gets closer to the expiration
of the item. This approach is depicted in Figure 2:
as the figure shows, each request essentially pretends
to be sometime in the future and if at that time the
cache is expired then it regenerates the item. Here, the
leap in the future depends on the probabilistic choice.
Since the probabilistic decision is made independently
by each request, the effect of the stampede is mitigated
as fewer requests will expire at the same time. The most
appealing feature of this approach is its simplicity. The
drawbacks are the choice of the probabilistic decision
(i.e. how to pick the distribution $\mathcal{D}$ in Figure 2); and
the lack of guarantees for its effectiveness, in terms
of assessing both the stampede reduction and when
early expirations happen (i.e. how much earlier than
the desired expiration). For example, if the item in
cache contains hourly statistics, it is important not to
regenerate these stats too much earlier than the hour
mark.

In this paper we show how probabilistic early expiration
can be made extremely effective. In particular, we present
a simple instantiation of the probabilistic decision with the
exponential distribution $\mathrm{Exp}(\lambda)$, which we demonstrate to be
optimal. A fundamental property we show is that the parameter
$\lambda$ need not depend on the rate of requests in
order to effectively reduce stampedes. This property makes
our solution very attractive in that, in its simplest form, it
requires no parameter tuning to perform effectively. This
implementation, which we call XFetch (for exponential
fetch), is detailed in Section 5. A higher request rate for
a fixed $\lambda$, while not affecting the stampede size, will cause
earlier expirations of the item in cache, but we show that
this dependency is very moderate.

The general problem of efficiently keeping a cache of frequently
accessed (but dynamically-changing) results has been
studied in many guises and settings, both from an applied
and a theoretical point of view, e.g.: Web (e.g., dynamic web
pages and search results [3,4]), networking (e.g., routers'
look-up tables [14]), databases (e.g., [2]). The cache stampede
problem strains many software systems [1,15], specifically
those based on decentralized cache value recomputations
(e.g., distributed web servers responding to web requests);
these systems are usually backed by caches with
primitive get/set operations, which do not provide locking
mechanisms or protection against stampedes (a notable
example is the widely-used distributed caching system
Memcached [8]). A number of systems use a probabilistic early
expiration strategy to avoid stampedes. Notably, a probabilistic
early expiration strategy is used in Perl's unified
Cache Handling Interface (CHI) [11], a module which is
part of many web applications (e.g., Drupal [5,10]).

CHI uses the uniform distribution to implement early expiration:
as documented in [11], it allows programmers
to set the length of the interval on which the uniform distribution
will be defined. In this paper, we will show that
the uniform distribution is far from being optimal, while the
exponential distribution is much more efficient (on all axes)
and close to optimality.


function Fetch(key, ttl)
    value ← CacheRead(key)
    if !value then
        value ← RecomputeValue()
        CacheWrite(key, value, ttl)
    end
    return value
end

Figure 1: Typical pattern for retrieving an item in
cache with a time-to-live. Here ttl is the time-to-live
of the cache item: after CacheWrite(key, value, ttl) is
called, the key will be present in the cache for ttl
units of time after which it will expire. The call to
RecomputeValue() is typically expensive.
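
The pattern of Figure 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `TTLCache` class is a hypothetical in-memory stand-in for a real cache server, the injectable `clock` exists purely to make the example testable, and a missing item is signalled with `None` rather than the falsy check of the pseudocode.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache server with time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expiry time)

    def read(self, key):
        entry = self._store.get(key)
        if entry is None or self._clock() >= entry[1]:
            return None  # item is missing or has expired
        return entry[0]

    def write(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def fetch(cache, key, ttl, recompute):
    """Read-through fetch: every caller that sees a miss recomputes the item."""
    value = cache.read(key)
    if value is None:
        value = recompute()  # typically the expensive step
        cache.write(key, value, ttl)
    return value
```

Nothing in `fetch` coordinates concurrent callers, which is exactly why all requests arriving during one recomputation repeat the work.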

The rest of the paper is organized as follows. We start
by describing a framework to model stampedes in Section 2.
Our results are then detailed in Section 3. The main
part of our analysis is in Section 4. Section 5 contains some
implementation notes, and Section 6 contains the results of
our experiments. The appendix contains the proofs missing
from the main body.

All the logarithms in this paper are assumed to be natural
(that is, $\log e = 1$, where $e$ is Napier's constant).

2.  MODEL 

In  this  section  we  define  a  general  framework  to  model 
the  effect  of  stampedes  and  the  efficacy  of  probabilistic  early 
expirations. 

Without loss of generality, we restrict our attention to an
arbitrary item in cache whose expiration time we assume to
be $r$. We assume that the recomputation of the item takes
one unit of time¹. We will use some of the basic terminology
of queueing theory, and of probability theory; the concepts
that we will use can be found in many textbooks (e.g., [6,9]).

2.1  Process  rate 

To capture the notion of process rate, we consider a (possibly
infinite) sequence of processes accessing the item in
cache at non-decreasing times $\{s_i\}_i$. Given a certain $n > 0$
representing the rate of processes, we model the inter-arrival
times $(s_i - s_{i-1})$ as independent samples from a non-negative
distribution with expectation $\frac{1}{n}$, so that on average $n$
processes will access the item in a single unit of time.

Definition 1 (Process arrival) Let $X$ be an arbitrary non-negative
distribution with expectation 1 and standard deviation
$\sigma_X$. The process arrival is defined by inter-arrival
times drawn independently from $\frac{X}{n}$. Hence, inter-arrivals
have mean $\frac{1}{n}$ and standard deviation $\frac{\sigma_X}{n}$.

As a notable example, the well-known Poisson point process
[13] of inter-arrival mean $\frac{1}{n}$ satisfies $\sigma_X = 1$ (that
is, mean and standard deviation are equal).

It  is  important  to  notice  that  since  the  recomputation  of 
the  item  takes  one  unit  of  time  and  the  process  arrival  is 
such  that  n  processes  access  the  item  in  a  unit  of  time,  we 
have stampedes of n processes (on average) if no form of
stampede  prevention  is  implemented. 
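
As a rough illustration of this arrival model, the sketch below generates arrival times whose inter-arrivals are samples of a mean-1 distribution divided by n, and checks that about n processes land in each unit of time. The function names and the `sample_x` callback are hypothetical helpers introduced only for this example.

```python
import random


def arrival_times(n, horizon, sample_x, rng):
    """Arrival times in [0, horizon) with inter-arrivals X / n, where X has mean 1."""
    times = []
    t = 0.0
    while True:
        t += sample_x(rng) / n
        if t >= horizon:
            return times
        times.append(t)


def mean_count_per_unit(times, horizon):
    """Average number of arrivals per unit of time."""
    return len(times) / horizon


rng = random.Random(0)
# Exponential inter-arrivals (Poisson point process, sigma_X = 1).
poisson_like = arrival_times(50, 200, lambda r: r.expovariate(1.0), rng)
# Perfectly regular arrivals (sigma_X = 0).
regular = arrival_times(50, 200, lambda r: 1.0, rng)
```

With no stampede prevention, the average count per unit of time is also the expected stampede size, since the recomputation lasts one unit.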

1 This is without loss of generality as we can simply scale
the process rate to allow for a different recomputation time.
See Section 5 for a more detailed discussion.

function Fetch(key, ttl; $\mathcal{D}$)
    value, expiry ← CacheRead(key)
    gap ∼ $\mathcal{D}$
    if !value or Time() + gap > expiry then
        value ← RecomputeValue()
        CacheWrite(key, value, ttl)
    end
    return value
end

Figure 2: Probabilistic early expiration of a cache
item. Here, $\mathcal{D}$ is a probability distribution with non-negative
support (i.e. gap ≥ 0). The variable expiry
represents the time at which the key expires from
the cache.
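
A minimal Python sketch of the pattern in Figure 2 follows. It makes several assumptions that are not part of the original: the cache is a plain dictionary mapping each key to a `(value, expiry)` pair, the gap distribution is passed in as a hypothetical `sample_gap` callable, and the clock is injectable for testing. Because the dictionary never evicts entries, an expired item is detected by the same comparison as an early expiration.

```python
import time


def fetch_early(store, key, ttl, recompute, sample_gap, now=time.monotonic):
    """Fetch with probabilistic early expiration.

    store maps key -> (value, expiry); sample_gap() returns a non-negative gap.
    """
    entry = store.get(key)
    gap = sample_gap()
    # Pretend the request happens `gap` time units in the future.
    if entry is None or now() + gap > entry[1]:
        value = recompute()
        store[key] = (value, now() + ttl)
        return value
    return entry[0]
```

Passing `sample_gap=lambda: 0.0` recovers the plain read-through behaviour, while any distribution with positive mass above zero makes some requests refresh the item ahead of time.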

2.2  Probabilistic  early  expirations 

We now proceed with formalizing probabilistic early expirations.
As previously discussed, this approach attempts to
mitigate stampedes by having the processes possibly regenerate
the item in cache before it expires, where this decision
is taken probabilistically and independently by each process.
As depicted in Figure 2, this probabilistic choice can be
interpreted as each process pretending to be sent sometime in
the future and checking if at that time the item would be
expired. It is evident that the crux of this approach lies in
choosing a distribution $\mathcal{D}$ to stochastically decide the time
gap each process employs.

Definition 2 (Gap distribution) A gap distribution $\mathcal{D}$ is
any distribution defined over $t \in \mathbb{R}_{\ge 0}$.

For example, the Cache Handling Interface (CHI) for Perl
sets $\mathcal{D}$ to the uniform distribution over the interval $[0, \xi]$,
where $\xi$ is a user-specified parameter to tune the tradeoff
between stampede prevention and how early expirations can
happen.

For a given gap distribution $\mathcal{D}$, we have that a process
accessing the item in cache (with expiration $r$) at time $s$
will regenerate it in either of the two following cases:

(i) Early expiration: When $s < r$ and $s + Y > r$, where
$Y$ is sampled from $\mathcal{D}$.

(ii) Regular expiration: When $s \ge r$. The cache item is
expired at this point so the process needs to refresh it.

Observe that as the cache access time $s$ gets closer to $r$,
the probability $\Pr_{Y \sim \mathcal{D}}(s + Y > r)$ of an early expiration
increases. For example, in the case of CHI, this probability
increases linearly as $s$ approaches $r$.

2.3  Effectiveness 

We finally proceed with defining how effective a gap distribution
$\mathcal{D}$ is. On one hand, early expirations may reduce
large stampedes since only a fraction of the processes accessing
the item will stampede on an early expiration; at
the same time, we also do not want to expire the item too
much earlier than the desired expiration time. These are the
two quantities that we care about.

Definition 3 (Stampede size) Fix an inter-arrival distribution
$X$, a process rate $n$ and a gap distribution $\mathcal{D}$. Let
$Z = Z(X, n, \mathcal{D})$ be the random variable denoting the first
time a process regenerates the item. The stampede size
$S_{X,n,\mathcal{D}}$ is the number of processes re-generating the item in
the time interval $[Z, Z + 1)$.

Observe that if $Z \ge r$ (that is, no process regenerates the
item before it expires) then we run into a regular expiration
causing a stampede of roughly $n$ processes during the time
interval $[Z, Z + 1)$. Analogously, if $Z = r - \varepsilon$, with $0 < \varepsilon < 1$,
then there will be a stampede of roughly $(1 - \varepsilon) \cdot n$ processes
in the time interval $[r, r + (1 - \varepsilon))$.

Definition 4 (Early expiration gap) Fix an inter-arrival
distribution $X$, a process rate $n$ and a gap distribution $\mathcal{D}$.
Let $Z = Z(X, n, \mathcal{D})$ be the random variable denoting the first
time a process regenerates the item. The early expiration
gap $T_{X,n,\mathcal{D}} = \max\{r - Z, 0\}$ is how much earlier than the
regular expiration time the early expiration occurred (or zero
if no early expiration occurs).

A low early expiration gap is particularly important in applications
where the cache item contains some periodic (typically
hourly or daily) statistics.

Note that both stampede size and early expiration gap
are random variables that depend on the randomness of the
process distribution $X$ and of the gap distribution $\mathcal{D}$.

Intuitively, if the gap distribution $\mathcal{D}$ allows for very early
expirations, then it should be more effective in reducing
large stampedes. On the other hand, if it only allows expirations
close to the desired expiration of the item, then
stampedes are more likely to be large. How effective a gap
distribution $\mathcal{D}$ is depends on a combination of these two criteria.

Definition 5 (Effectiveness) Fix an inter-arrival distribution
$X$. Then we say that a gap distribution $\mathcal{D}$ is $(s, g)$-effective
if

1. $E[S] \le (1 + o_n(1))\,s$

2. $E[T] \le (1 + o_n(1))\,g$

(In the definition above the "little-o" notation $o_n(1)$ hides
factors going to 0 as $n$ grows².)

Consider a scenario where early expirations are never done.
This scenario is obtained by a gap distribution $\mathcal{D}_0$ that assigns
probability one at $t = 0$ and zero otherwise. Then we
have that $\mathcal{D}_0$ is $(n, 0)$-effective, as the early expiration gap
is always zero but the $n$ processes (in expectation) accessing
the cache between time $r$ and $r + 1$ will all regenerate the
cache item.

The core problem addressed in this paper is whether we
can find a distribution $\mathcal{D}$ which gives the best of both worlds.
That is, a distribution $\mathcal{D}$ which can substantially reduce the
size of the stampedes, while keeping the early expiration gap
low.
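
The two quantities of this section can be estimated with a small simulation. The sketch below is an illustration only, under simplifying assumptions that are not in the original: arrivals are perfectly regular with cadence 1/n, the item expires at time 0, and `simulate_once`, `sample_gap` and `horizon` are hypothetical names chosen for this example.

```python
import random


def simulate_once(n, sample_gap, horizon=20):
    """One run with arrivals at times k / n and expiration at time 0.

    Returns (stampede_size, early_expiration_gap).
    """
    total_early = horizon * n
    regenerates = []
    for k in range(-total_early, n):
        s = k / n  # arrival time relative to the expiration time
        # Regular expiration if s >= 0, early expiration if s + gap > 0.
        regenerates.append(s >= 0 or s + sample_gap() > 0)
    first = regenerates.index(True)  # the arrival at time 0 always regenerates
    z = (first - total_early) / n  # time Z of the first regeneration
    # Count regenerating processes among the n arrivals in [Z, Z + 1).
    stampede = sum(regenerates[first:first + n])
    return stampede, max(0.0, -z)


rng = random.Random(1)
no_early = simulate_once(100, lambda: 0.0)
with_exp = simulate_once(100, lambda: rng.expovariate(1.0))
```

With the degenerate gap that is always zero, every one of the n processes in the unit interval after expiration regenerates the item and the early expiration gap is zero, matching the $(n, 0)$ scenario described above.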

3.  OUR  RESULTS 

We first consider the uniform distribution $\mathcal{D} = U(0, \xi)$,
used for instance by the Perl Cache Handling Interface (CHI).
While, by tuning $\xi$, this distribution is able to reduce stampedes
by increasing the early expiration gap, we have that
this trade-off has a linear dependence in $\xi$. In particular, we
can show the following result.

2 By definition, $f(n) = o(g(n))$ if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$.

Theorem 6 (Uniform) The uniform distribution $U(0, \xi)$
is no better than $\left(\frac{n}{2\xi}, \xi\right)$-effective.

This means that if we want to reduce the stampede size to
$\sqrt{n}$, then we need to allow for expirations as early as $\sqrt{n}/2$.
In practice this may not be good enough: for example,
suppose we have a frequently-accessed cache item that gets
recomputed once a day and whose recomputation takes one
minute; then if we have 10,000 processes accessing the item
in a minute, we would need to allow expirations as early as
50 minutes to reduce the stampede size to 100 processes.
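
The arithmetic behind this example can be spelled out as follows, taking the stampede bound for $U(0, \xi)$ to be $\frac{n}{2\xi}$ (the bound given by Lemma 12) and measuring time in units of one recomputation, here one minute. The target stampede size is written $s$, a symbol introduced only for this worked example.

```latex
\[
\frac{n}{2\xi} = s
\;\Longrightarrow\;
\xi = \frac{n}{2s},
\qquad
n = 10{,}000,\; s = \sqrt{n} = 100
\;\Longrightarrow\;
\xi = \frac{10{,}000}{2 \cdot 100} = 50 \text{ minutes}.
\]
```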

We then propose to use an exponential distribution of parameter
$\lambda$ whose implementation, XFetch, is extremely
simple (see Section 5). We are able to show that the exponential
distribution is able to drastically reduce the size
of a stampede while keeping expirations close to the desired
expiration time.


Definition 10 (Early expiration probability) Fix a gap
distribution $\mathcal{D}$. We define $f_{\mathcal{D}}(y)$ as the probability that a
$y$-early process performs an early expiration when sampling
from $\mathcal{D}$:

$$f_{\mathcal{D}}(y) = \Pr_{Y \sim \mathcal{D}}(Y > y) = 1 - \Pr_{Y \sim \mathcal{D}}(Y \le y).$$

4.1  Uniform  distribution 

In this section we instantiate the gap distribution $\mathcal{D}$ with
the uniform distribution $U(0, \xi)$; this will serve both as a
warm-up for some of the techniques that we will use in the
paper, as well as a proof of Theorem 6. By definition of
$U(0, \xi)$, we have that

$$
f_{U(0,\xi)}(y) =
\begin{cases}
1 - \frac{y}{\xi}, & \text{for } 0 \le y \le \xi \\
0, & \text{for } y > \xi
\end{cases}
$$

Theorem 7 (Exponential) The exponential distribution
$\mathrm{Exp}(\lambda)$ is $\left((e^{\lambda} - 1)\left(\frac{1}{\lambda} + \frac{1}{e}\right),\ \frac{1}{\lambda}\log n\right)$-effective.

For example, for $\lambda = 1$, we obtain an early expiration gap of
size $\log n$ with stampedes of size $e - \frac{1}{e}$. Going back to our
example above, an exponential distribution with $\lambda = 1$ would
reduce the stampede size to merely $e - \frac{1}{e} \approx 2.4$ processes
without having expirations earlier than $\log_e(10{,}000) \approx 9.2$
minutes.
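
The numbers in this example can be checked directly. The sketch below assumes the effectiveness pair of Theorem 7 has the form $\left((e^{\lambda} - 1)(\frac{1}{\lambda} + \frac{1}{e}), \frac{1}{\lambda}\log n\right)$, which for $\lambda = 1$ reduces to $e - \frac{1}{e}$; the function names are hypothetical.

```python
import math


def exp_stampede_bound(lam):
    """Leading term of the stampede size for Exp(lam), as read from Theorem 7."""
    return (math.exp(lam) - 1.0) * (1.0 / lam + 1.0 / math.e)


def exp_gap_bound(lam, n):
    """Leading term of the early expiration gap, in recomputation-time units."""
    return math.log(n) / lam


stampede = exp_stampede_bound(1.0)  # equals e - 1/e
gap_minutes = exp_gap_bound(1.0, 10_000)  # recomputation takes one minute
```

Both bounds are asymptotic, so these figures describe the leading terms rather than exact expectations for a finite n.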

It is natural to ask whether there exists a distribution
that would guarantee a constant early expiration gap (rather
than $\log n$) while keeping the stampede size down to a constant.
We show that this is not possible and, in fact, that the
exponential distribution is optimal in this respect.

Theorem 8 (Optimality) Consider any distribution $\mathcal{D}$ that
is independent of $n$. If, for each $n$, $\mathcal{D}$ has early expiration
gap at most $\frac{1}{\lambda}\log n$, then it has expected stampede size at
least $e^{\Omega(\lambda)}$.

For concreteness, observe that Theorem 8 implies that the
Exponential distribution is optimal in the full range of $\lambda$:
e.g., if we want the early expiration gap to be at most
$\log\log n$, then the expected stampede size has to be at least
$e^{\Omega(\log n / \log\log n)}$, and the Exponential distribution matches
this bound. Observe that the above optimality theorem
holds if $\mathcal{D}$ is independent of the process rate $n$, that is,
if the algorithm cannot make any guess on the process rate.
In Section 7, we will see that if we have an approximate
knowledge of $n$, then we can make the early expiration gap
smaller than $O(\log n)$ while keeping the expected stampede
size to a constant.

We then evaluate the performance of XFetch on real-world
and synthetic datasets. Our experiments show that
our approach outperforms current methods from all angles
even when $\lambda = 1$. In addition, the experimental results show
that our approach is very robust to bursts.


Observe  how  the  probability  of  a  process  performing  an  early 
expiration  increases  linearly  as  the  time  approaches  the  item 
expiration. 

We will show that the uniform distribution fails to achieve
good efficiency even for the simple case where the process
inter-arrival times are all equal to $\frac{1}{n}$, that is when the process
inter-arrival distribution is such that $\sigma_X = 0$.

The following lemma shows that the early expiration gap
tends to $\xi$ as $n$ grows.

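Lemma 11 (Early expiration gap for $U(0, \xi)$) Let $T =
T_{U(0,\xi)}$ be the early expiration gap. For any $\alpha = \alpha(n) > 0$,
we have that

$$\Pr\left(T \le \left(1 - \frac{\alpha}{n}\right)\xi\right) \le e^{-(\alpha^2 \xi / n - 1)}.$$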


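Proof. Consider any $0 \le y \le \xi$. In order to have early
expiration gap $T \le y$, it must be that no $x$-early process
with $y \le x \le \xi$ performs an early expiration. The probability
that an $x$-early process does not perform an expiration is
$1 - f_{U(0,\xi)}(x)$. Since the cadence of the processes is exactly
$\frac{1}{n}$, we can write

$$
\begin{aligned}
\Pr(T \le y) &\le \prod_{i=0}^{n(\xi - y)} \left(1 - f_{U(0,\xi)}\left(\xi - \frac{i}{n}\right)\right) \\
&= \prod_{i=0}^{n(\xi - y)} \left(1 - \frac{i}{n\xi}\right) \\
&= \exp\left(\sum_{i=0}^{n(\xi - y)} \log\left(1 - \frac{i}{n\xi}\right)\right) \\
&\le \exp\left(\int_{z=0}^{n(\xi - y) - 1} \log\left(1 - \frac{z}{n\xi}\right) dz\right),
\end{aligned}
$$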


4.  ANALYSIS 

In this section we assume that the item in cache we consider
has expiration time $r$. We start with a couple of definitions
that will be useful for the analysis.

Definition 9 (y-early process) For $y > 0$, we say that a
process is $y$-early if it arrives at time $r - y$. In other words,
a $y$-early process is a process that accesses the item in cache
$y$ units of time before its expiration time.


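where the last step holds since $\log\left(1 - \frac{z}{n\xi}\right)$ is decreasing
in $z$.

By solving the integral, we get

$$
\Pr(T \le y) \le \exp\left(-(ny + 1)\log\frac{ny + 1}{n\xi} + (ny + 1) - n\xi\right)
= \left(\frac{n\xi}{ny + 1}\right)^{ny + 1} e^{-n(\xi - y) + 1}.
$$

By substituting $y = \left(1 - \frac{\alpha}{n}\right)\xi$, we can conclude

$$
\Pr\left(T \le \left(1 - \frac{\alpha}{n}\right)\xi\right) \le e^{\alpha\xi\left(1 - \frac{\alpha}{n}\right)} e^{-\alpha\xi + 1} = e^{-(\alpha^2 \xi / n - 1)},
$$

where the second inequality holds since $1 - x \le e^{-x}$ for any
$0 \le x \le 1$. □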


Applying the lemma above with $\alpha = \alpha(n) = \sqrt{n(1 + \log n)/\xi}$, we
can conclude that the early expiration gap is at least
$\left(1 - \frac{\alpha}{n}\right)\xi = (1 - o(1))\xi$ with probability at least $1 - O\left(\frac{1}{n}\right)$.
This also implies that $E[T] \ge (1 - o(1))\xi$.

To establish Theorem 6, it is left to show that the stampede
size decreases linearly with $\xi$, which we do in the following
lemma.

Lemma 12 (Stampede size for $U(0, \xi)$) Let $S = S_{U(0,\xi)}$
be the stampede size. Then,

$$E[S] \ge \frac{n}{2\xi}.$$

Proof. Since $f_{U(0,\xi)}(y)$ increases as $y$ decreases and the
early expiration gap cannot be more than $\xi$, we have that
the size of the stampede starting exactly $\xi$ units of time
before the expiration is a lower bound on the size of any
stampede. Hence,

$$E[S] \ge \sum_{i=0}^{n} f_{U(0,\xi)}\left(\xi - \frac{i}{n}\right) = \sum_{i=0}^{n} \frac{i}{n\xi} = \frac{n + 1}{2\xi}.$$

□

4.2  Exponential  distribution 

In this section we instantiate the gap distribution $\mathcal{D}$ with
the exponential distribution $\mathrm{Exp}(\lambda)$. By Definition 10 and
the fact that $\Pr(Y \le y) = 1 - e^{-\lambda y}$ for $Y \sim \mathrm{Exp}(\lambda)$, we
have that the probability that a $y$-early process regenerates
the item in cache is

$$f_{\mathrm{Exp}(\lambda)}(y) = 1 - \Pr_{Y \sim \mathrm{Exp}(\lambda)}(Y \le y) = e^{-\lambda y}.$$
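
To compare the two gap distributions concretely, the early expiration probability $\Pr(Y > y)$ of a $y$-early process can be written as a pair of small Python functions. The names `f_uniform` and `f_exponential` are illustrative, and the parameters `xi` and `lam` stand for the interval length of the uniform distribution and the rate of the exponential distribution.

```python
import math


def f_uniform(y, xi):
    """Pr(Y > y) for Y ~ U(0, xi): decreases linearly, zero beyond xi."""
    if y < 0:
        raise ValueError("y must be non-negative")
    return max(0.0, 1.0 - y / xi)


def f_exponential(y, lam):
    """Pr(Y > y) for Y ~ Exp(lam): decays exponentially in y."""
    if y < 0:
        raise ValueError("y must be non-negative")
    return math.exp(-lam * y)
```

The uniform distribution never triggers an expiration more than `xi` early, whereas the exponential distribution allows arbitrarily early expirations but with rapidly vanishing probability.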

Since we are dealing with the general class of inter-arrival
distributions as per Definition 1, we are going to use Chebyshev's
lemma to bound the number of processes present in
a specific interval.

Lemma 13 (Chebyshev [6,9]) Let $X$ be a random variable
with finite expected value $\mu$ and finite non-zero variance
$\sigma^2$. Then for any real number $k > 0$,

$$\Pr(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}.$$
The following lemma uses Chebyshev's bound to establish
that with high probability the number of processes present in
any interval of unitary length is tightly concentrated around
its expectation $n$. (The proofs of all claims in this section
are deferred to the Appendix.)

Lemma 14 (Concentration) Let the processes be distributed
according to the inter-arrival distribution from Definition 1.
Consider any interval of unit length and let $N$ be the number
of processes in the interval. For any real number $\delta > 0$, we
have

$$\Pr(N \ge (1 + \delta)n) \le \frac{(1 + \delta)\sigma_X^2}{\delta^2 n}.$$

For any real number $0 < \delta < 1$, we have

$$\Pr(N \le (1 - \delta)n) \le \frac{(1 - \delta)\sigma_X^2}{\delta^2 n} \le \frac{\sigma_X^2}{\delta^2 n}.$$


The following lemma shows that the early expiration gap
will not exceed $y = \frac{1 + \varepsilon}{\lambda}\log n$ with high probability. The crucial
observation in the proof is that each of the (roughly) $n$
processes in the interval right before this threshold performs
an early expiration with probability at most $f_{\mathrm{Exp}(\lambda)}(y) =
\frac{1}{n^{1 + \varepsilon}}$, so the probability that at least one of them will
regenerate the item is at most $\frac{1}{n^{\varepsilon}}$.

Lemma 15 (Early expiration gap for $\mathrm{Exp}(\lambda)$) Let $T =
T_{\mathrm{Exp}(\lambda)}$ be the early expiration gap. Then for any $\varepsilon > 0$,
$\delta > 0$,

$$\Pr\left(T \ge \frac{1 + \varepsilon}{\lambda}\log n\right) \le \frac{1 + \delta}{n^{\varepsilon}(1 - e^{-\lambda})}.$$


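The analysis of the stampede size is more subtle. It essentially
establishes that the probability of an expiration
as early as $i + \frac{1}{\lambda}\log n$ is roughly $e^{-e^{\lambda i}}(e^{\lambda} - 1)$ and entails
stampedes of size $e^{\lambda i}$. The expected stampede size is then
$\sum_i e^{-e^{\lambda i}}(e^{\lambda} - 1)e^{\lambda i} \approx (e^{\lambda} - 1)\left(\frac{1}{\lambda} + \frac{1}{e}\right)$.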

Lemma 16 (Stampede size for $\mathrm{Exp}(\lambda)$) Let
$S = S_{\mathrm{Exp}(\lambda)}$ be the stampede size. Then, for any $\delta > 0$,


E,S,<(1  +  o(^)) 


Theorem 7 follows by having $\varepsilon$ slowly approach zero
in Lemma 15 (implying an expected early expiration
gap of at most $(1 + o(1))\frac{1}{\lambda}\log n$), and having $\delta$
approach zero in Lemma 16.

4.3  Optimality 

We will use a Poisson point process as model for the inter-arrival
distribution [13]. That is, in the notation of Definition 1,
we have $X = \mathrm{Exp}(1)$ with $\sigma_X = 1$. For a process rate
$n$, we then have that the number $X$ of processes in any unit
interval is distributed like a Poisson random variable with
parameter $n$: $\Pr(X = k) = \frac{n^k e^{-n}}{k!}$.

Consider any gap distribution $\mathcal{D}$ independent of $n$. By
Definition 10, $f_{\mathcal{D}}(y)$ is the probability that a $y$-early process
performs an early expiration according to the distribution $\mathcal{D}$.
Assuming $f_{\mathcal{D}}(y)$ is integrable³, let $p_i = \int_i^{i+1} f_{\mathcal{D}}(y)\,dy$, for
each integer $i \ge 0$; that is, $p_i$ is the probability that some
$y$-early process, where $y$ is chosen uniformly at random in
$(i, i + 1]$, performs an early expiration. We will show that $p_i$
cannot be much larger than what it is with the exponential
distribution, unless the early expiration gap is larger than
$\log n$.

3 Observe that $f_{\mathcal{D}}(y)$ is integrable in Riemann terms if it
is monotone, or has finitely many discontinuities; $f_{\mathcal{D}}(y)$ is
integrable in Lebesgue terms if it has at most countably
many discontinuities.

function XFetch(key, ttl; β = 1)
    value, Δ, expiry ← CacheRead(key)
    if !value or Time() - Δ β log(rand()) > expiry then
        start ← Time()
        value ← RecomputeValue()
        Δ ← Time() - start
        CacheWrite(key, (value, Δ), ttl)
    end
    return value
end

Figure 3: Simple implementation of cache stampede
prevention with exponential gap distribution
$\mathcal{D} = \mathrm{Exp}(\frac{1}{\beta})$. The parameter β defaults to 1 and
already provides effective prevention against cache
stampedes. It can be increased for even better guarantees
against stampedes, if earlier expirations are
not a concern.

[Plot legend: Exp(λ), λ = 1; U(0, ξ), ξ = 10; U(0, ξ), ξ = 20. Axis label: early expiration gap (seconds).]

Figure 4: Scatterplot, regeneration time is 10s.

Lemma 17 Fix any $\varepsilon > 0$. Suppose that, for each $n \ge 2$,
the early expiration gap $T_{\mathcal{D}}$ satisfies $E[T_{\mathcal{D}}] \le \varepsilon\log(n - 1)$.
Then  pi  <  e_(1_e  /  )«  for  each  integer  i  >  0. 

We will now use Lemma 17 (which upper bounds the probability
that a process, whose arrival is chosen uniformly at
random in a window of size 1, performs an early expiration)
to prove Lemma 18, which then directly entails Theorem 8,
our main lower bound.

Lemma 18 Let $\varepsilon > 0$ be small enough. If, for each $n \ge 2$,
the early expiration gap $T_{\mathcal{D}}$ satisfies $E[T_{\mathcal{D}}] \le \varepsilon\log(n - 1)$
then the expected stampede is at least $e^{\Omega(1/\varepsilon)}$.

5.  IMPLEMENTATION  NOTES 

In this section we present an explicit implementation of
our cache stampede prevention approach, XFetch (taking
its name from our use of the exponential function).

The discussion thus far assumed (without loss of generality)
that the recomputation of the item in cache takes
one unit of time. This allowed our analysis to have the gaps
sampled from $\mathrm{Exp}(\lambda)$ be independent of a particular time
unit. In practice though, the gaps we sample from $\mathrm{Exp}(\lambda)$
need to be scaled by the recomputation time. Figure 3
shows how this time $\Delta$ can be recorded upon recomputation
of the item and stored as part of the cache value. The
scaled gap $-\Delta\beta\log(\mathrm{rand}())$ corresponds to sampling from
$\mathcal{D} = \mathrm{Exp}(\frac{1}{\beta})$ and scaling by a factor $\Delta$.

The  pseudo-code  assumes  that  the  cache  server  is  able 
to  provide  the  expiration  time  (expiry)  of  the  item  upon  a 
cache  read  of  the  corresponding  key.  If  this  is  not  the  case, 
this  expiration  can  be  stored  also  as  part  of  the  cache  value, 
which  will  then  become  (value,  A,  Time()  -I -ttl).  Also,  to 
avoid  accumulating  the  gaps  given  by  the  early  expirations, 
we  can  simply  adjust  the  ttl  <—  ttl  +  (expiry  —  Time())  right 
before  the  cache  write. 

6.  EXPERIMENTS 

In  this  section  we  describe  experimental  results  based  on 
the  results  discussed  thus  far. 

Figure 5: Stacked histogram of distribution of stampede
sizes (axis label: stampede size); regeneration time is 10s.

For our experiments we use a real dataset that consists
of a week of requests for a popular cache item used in the
www.goodreads.com website. In particular, the cache item in
consideration is an hourly statistic for the most popular tags
used in all of the user-created quotes. The recomputation
of this cache item takes about 10 seconds. The inter-arrival
time of the requests in this dataset is about 0.07s with a
standard deviation of 0.25s. In the terminology of Definition 1,
this means that $\sigma_{\mathcal{X}} \approx 0.25/0.07 \approx 3.6$. Since the
time to regenerate the item is about 10 seconds, we have
that the process rate is $n \approx 140$ on average.
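
As a back-of-the-envelope check of these figures, using only the
rounded values quoted above (so the results are approximate):

$$\sigma_{\mathcal{X}} \approx \frac{0.25\,\mathrm{s}}{0.07\,\mathrm{s}} \approx 3.6,
\qquad
n \approx \frac{10\,\mathrm{s}}{0.07\,\mathrm{s}} \approx 143,$$

which agrees with the reported $n \approx 140$ up to the rounding of the
inter-arrival time.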

Figure 4 shows a scatter plot where each data point for a
specific distribution corresponds to the regeneration of the
cache item with respect to a specific hour-long interval during
the week of data. Specifically, consider a data point
$(x, y)$ of a specific distribution $\mathcal{D}$ and say it corresponds to
a specific hour-long interval ending at time $t$. Then this
data point signifies that out of all processes in that interval,
the first one to perform an early expiration did so $x$
seconds before $t$, and caused a stampede of size $y$ (that is,
$y - 1$ more processes performed an early expiration during
the 10-second window starting at time $t - x$).

Figure 6: Stampede size as a function of the early
expiration gap; regeneration time is 10s.

Figure 7: Stampede size as a function of the distribution
parameter; regeneration time is 10s.

The exponential gap distribution $\mathrm{Exp}(\lambda)$ with $\lambda = 1$ clearly
outperforms the uniform distribution $U(0,\xi)$ with $\xi = 10$,
both in terms of stampede size and early expiration gap.
Even when allowing the uniform distribution to perform
expirations twice as early, by setting $\xi = 20$ (which has been
suggested to be a good choice of parameter for the uniform
distribution in CHI [11, 12]), we still get stampedes of much
larger size than with the exponential distribution. This fact is
made clearer in Figure 5, where we show the distribution
of stampede sizes. For the exponential distribution,
most stampedes have size 1 (i.e., no stampede) or 2, and
no stampede is larger than 8. On the other hand, the uniform
distribution shows an average stampede size closer to 10, with
occasional dangerous stampedes of size 20 or more.
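
The qualitative comparison above can be explored with a small Monte
Carlo sketch. It is not the authors' experimental code: it assumes
Poisson arrivals at 14 requests per second (roughly the 0.07s
inter-arrival time quoted earlier) instead of the real trace, and it
assumes that the uniform gap, like the exponential one, is scaled by
the regeneration time. The names `simulate_cycle` and `mean_stampede`
are hypothetical.

```python
import random

def simulate_cycle(rate, delta, sample_gap, rng, horizon):
    """Simulate one expiry cycle; return (early_gap, stampede_size).

    The item expires at time 0 and takes `delta` seconds to regenerate.
    Requests arrive as a Poisson process with `rate` requests per second,
    starting `horizon` seconds before the expiry.
    """
    t = -horizon
    first = None  # time of the first recomputation
    size = 0      # number of requests that recompute the item
    while True:
        t += rng.expovariate(rate)
        if first is not None and t >= first + delta:
            # The fresh value has been written: the stampede is over.
            return max(0.0, -first), size
        # A request recomputes on a regular miss (t >= 0) or when its
        # sampled gap reaches the expiry time (early expiration).
        if t >= 0 or t + sample_gap() >= 0:
            if first is None:
                first = t
            size += 1

def mean_stampede(sample_gap, rng, trials=50, rate=14.0, delta=10.0):
    """Average stampede size over independent expiry cycles."""
    sizes = [simulate_cycle(rate, delta, sample_gap, rng, 25 * delta)[1]
             for _ in range(trials)]
    return sum(sizes) / trials

rng = random.Random(0)
delta = 10.0
# Exponential gap with lambda = 1, scaled by the regeneration time.
exp_mean = mean_stampede(lambda: delta * rng.expovariate(1.0), rng)
# Uniform gap with xi = 10, also scaled by delta (an assumption).
uni_mean = mean_stampede(lambda: delta * rng.uniform(0.0, 10.0), rng)
print(exp_mean, uni_mean)
```

Under these assumptions the exponential gap should give a clearly
smaller average stampede than the uniform one; the exact numbers depend
on the seed and on the modelling choices listed above.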

Figure 6 shows the average stampede size as a function of
the average early expiration gap, where the average is taken
over all the hour-long intervals with 100 trials per interval.
The different values are obtained by varying the distribution
parameter. It is striking how stampedes of size less
than 10 (respectively, less than 5) are achieved with expirations
that are less than 20 seconds (respectively, less than
40 seconds) early, especially considering that regenerating
the item in cache takes 10 seconds. Analogously, Figure 7
shows the average stampede size as a function of the distribution
parameter. The dashed horizontal line is at $y = 2$,
and shows that increasing $\beta = 1/\lambda$ to 1.5 already drops the
average stampede size to less than 2.

Figure 8: Scatterplot; regeneration time is 1 minute.

Figure 9: Stacked histogram of distribution of stampede
sizes; regeneration time is 1 minute.

Using the same dataset, we now move to the scenario
where the time to regenerate the item is 1 minute. Note
that this could also be viewed as increasing the process rate.
Indeed, in this case we obtain a process rate of $n \approx 840$ on
average, which could potentially lead to dreadful stampedes.
Figures 8-9 show that the exponential distribution $\mathrm{Exp}(\lambda)$ with
$\lambda = 1$ is still as effective as before in preventing stampedes,
with no stampede larger than 10. On the other hand, a
higher rate heavily penalizes the uniform distribution, which
exhibits alarming stampedes of size over 80 and 50, for
$\xi = 10$ and $\xi = 20$, respectively. Figures 10-11 complete
the picture by showing average stampede size under 5 with
early expiration gap less than 5 minutes, still when using
$\mathrm{Exp}(\lambda)$ with $\lambda = 1$.

6.1  Bursts 

Figure 10: Stampede size as a function of the early
expiration gap; regeneration takes 1 minute.

Figure 11: Stampede size as a function of the distribution
parameter; regeneration takes 1 minute.

In this section we demonstrate the robustness of our approach
to sudden bursts of requests. We generate a synthetic
sequence of requests using the following model of bursts [7]:
each time interval is either in a "low" or a "high" state, and
from an interval to the next we change state with probability
$p$. In the low (resp. high) intervals, processes are generated
with a Poisson point process of rate $n_{\mathrm{low}}$ (resp. $n_{\mathrm{high}}$). For
our experiments, we test our approach against aggressive
bursts by setting $n_{\mathrm{low}} = 50$, $n_{\mathrm{high}} = 500$ and $p = 0.1$. Our
intervals are of length 10 seconds, which we also use as the
time to regenerate the cache item.
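
The burst model just described can be sketched as follows. This is an
illustration under stated assumptions rather than the generator used in
the experiments: the function name `bursty_arrivals` is hypothetical,
the sequence is assumed to start in the "low" state (the text does not
say), and the rates are interpreted as expected arrivals per interval.

```python
import random

def bursty_arrivals(num_intervals, n_low=50, n_high=500, p=0.1,
                    interval=10.0, seed=0):
    """Return sorted arrival times (seconds) for the two-state model."""
    rng = random.Random(seed)
    high = False  # assumed initial state; not specified in the text
    arrivals = []
    for k in range(num_intervals):
        # Convert arrivals per interval into arrivals per second.
        rate = (n_high if high else n_low) / interval
        t = k * interval
        end = t + interval
        while True:
            # Exponential inter-arrival times give a Poisson process.
            t += rng.expovariate(rate)
            if t >= end:
                break
            arrivals.append(t)
        # Switch state with probability p before the next interval.
        if rng.random() < p:
            high = not high
    return arrivals
```

With the parameters of the text, `bursty_arrivals(360)` would produce
one hour of requests in which the rate jumps by a factor of 10 roughly
once every ten intervals.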

Figures 12-13 show how the exponential distribution is practically
immune to sudden bursts, especially when comparing
this behavior with that of Figures 4-5 (where $n \approx 140$). On
the other hand, the comparison shows that the uniform distribution
suffers from higher fluctuations in stampede size, with
peaks around 70 and 45 for $\xi = 10$ and $\xi = 20$, respectively.

7.  EXTENSIONS:  KNOWN  RATE 

In this section we propose algorithms that work under the
assumption that the process rate is approximately known.
This knowledge of $n$ will allow us to beat the theoretical
lower bound proved in Theorem 8, which deals with algorithms
that are oblivious of $n$.

We will start by showing that, if we have a constant multiplicative
approximation of $n$, then we can decrease the early
expiration gap to $O(\log\log n)$ while keeping the expected
stampede to a constant independent of $n$.


Figure 12: Scatterplot for bursty data.

Figure 13: Stacked histogram of distribution of
stampede sizes for bursty data (axis label: stampede size).

We will then move on to show that, with a very precise
knowledge of $n$, the early expiration gap can be lowered
to $O(\log^* n)$, that is, to the iterated natural logarithm of
$n$: the number of times one can apply the $\log(\cdot)$ function,
starting from $n$, until reaching a value of at most 1 (see
footnote 4). The iterated logarithm grows extremely slowly: e.g., to
get $\log^* n > 3$, it is necessary that $n > 3814279$, and to get
$\log^* n > 4$, $n$ has to be larger than $10^{1656520}$.
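
The thresholds quoted above can be checked with a short sketch of the
iterated natural logarithm. The function name `log_star` is
hypothetical, and the base case is taken to be $n \le 1$, which is the
reading of the footnote's recursion that reproduces the thresholds
quoted in the text.

```python
import math

def log_star(n):
    """Iterated natural logarithm: 0 if n <= 1, else 1 + log_star(log n)."""
    count = 0
    while n > 1:
        n = math.log(n)
        count += 1
    return count

print(log_star(3814279), log_star(3814280))
```

The jump between these two arguments reflects the fact that
$e^{e^e} \approx 3814279.1$.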

This second algorithm is then, in theory, much more effective
than the first when the process rate $n$ is known in
advance; unfortunately, though, the process rate is unstable
in real applications. Thus, we believe that the
iterated-logarithm algorithm is not going to be effective in practice.
In terms of our goal of understanding the limits of what a
cache-stampede prevention (early expiration) strategy can
achieve, though, it is quite useful to describe and analyze
this algorithm.

4 The iterated natural logarithm $\log^* n$ equals 0 if $n \le 1$,
and equals $1 + \log^*(\log n)$ otherwise.

We now describe our approximately-known rate algorithm
that gives an $O(\log\log n)$ expected early expiration gap.
Essentially, we will modify the (exponential) gap distribution
of Theorem 7 to get a smaller expiration gap, assuming some
knowledge of $n$. Suppose that $\tilde{n}$ is our guess of $n$. Then, the
distribution $\mathcal{D} = \mathcal{D}_{\tilde{n}}$ will be defined as follows:

$$\mathcal{D}_{\tilde{n}} = \begin{cases}
\mathrm{Exp}(\lambda) & \text{with probability } \frac{\log \tilde{n}}{\tilde{n}}, \\
0 & \text{with probability } 1 - \frac{\log \tilde{n}}{\tilde{n}}.
\end{cases}$$

Observe that, for $y > 0$, whenever a $y$-early process samples
a gap of 0 from $\mathcal{D}$, it can be disregarded from an early
expiration point of view, as it will not trigger a recomputation.
Intuitively, then, our gap distribution $\mathcal{D}$ reduces the process
rate from $n$ to $O(\log n)$. Our analysis of the exponential gap
distribution can then be used to bound the performance of
$\mathcal{D}$.
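
Sampling from such a mixture can be sketched as follows. The mixing
probability $\log\tilde{n}/\tilde{n}$ used here is an assumption: it is the
value suggested by the bound $\frac{1}{\alpha}\log(n/\alpha) \le pn \le \alpha\log(\alpha n)$
in the proof of Lemma 19 and by the reduction of the rate to
$O(\log n)$. The function name `sample_gap_known_rate` is hypothetical.

```python
import math
import random

def sample_gap_known_rate(n_guess, lam=1.0, rng=random):
    """Draw a gap: Exp(lam) with probability log(n_guess)/n_guess, else 0.

    Assumes n_guess >= 1, so that the mixing probability lies in [0, 1/e].
    """
    p = math.log(n_guess) / n_guess
    if rng.random() < p:
        return rng.expovariate(lam)
    # A zero gap never triggers an early expiration for a y-early
    # process with y > 0, which is how the effective rate is reduced.
    return 0.0
```

For a guess of 140, about 3.5% of the processes draw a non-zero gap,
so on average roughly $\log 140 \approx 4.9$ processes per unit of time
take part in the exponential scheme.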

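Lemma 19 Suppose $n/\alpha \le \tilde{n} \le \alpha n$, for some $\alpha \ge 1$. If
$\alpha \le e^{o(\log\log n)}$, then the distribution $\mathcal{D}_{\tilde{n}}$ is

$$\left( (1+o(1))\,(e^{\lambda}-1)\left(\frac{1}{\lambda}+\frac{1}{e}\right),\;
(1+o(1))\,\frac{\log\log n}{\lambda} \right)\text{-effective.}$$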

Finally, we state our lemma about the best strategy we
know of for the case of known process rates.

Lemma 20 There exists a known-rate strategy with $O(1)$
expected stampede size, and $O(\log^* n)$ expiration gap.


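Proof of Lemma 15 (continued). Set $\ell_i = e^{\lambda i/2}$, and let
$N_i$ denote the number of $y$-early processes with
$\nu + i < y \le \nu + i + 1$. Fix any $\delta > 0$, and let
$B_{i,j}$ be the event $\frac{N_i}{(1+\delta)n} \in [j, j+1)$. Then

$$\Pr(E_i) \le \sum_{j=0}^{\ell_i} \Pr(E_i \mid B_{i,j})\,\Pr(B_{i,j})
+ \Pr\left(\frac{N_i}{(1+\delta)n} \ge \ell_i + 1\right).$$

We can bound $\Pr(E_i \mid B_{i,j}) \le (j+1)(1+\delta)\,n\,e^{-\lambda(\nu+i)}
= (j+1)(1+\delta)\,e^{-\lambda i}\,n^{-\epsilon}$. For $1 \le j \le \ell_i$, this is at
most $O(\ell_i e^{-\lambda i} n^{-\epsilon}) = O(e^{-\lambda i/2} n^{-\epsilon})$. For
$1 \le j \le \ell_i$ we can apply Lemma 14 to obtain
$\Pr(B_{i,j}) \le \Pr\left(\frac{N_i}{(1+\delta)n} \ge 1\right)
\le \frac{(1+\delta)\sigma_{\mathcal{X}}^2}{\delta^2 n}
= O(\sigma_{\mathcal{X}}^2 \delta^{-2} n^{-1})$. Finally, for
$\Pr\left(\frac{N_i}{(1+\delta)n} \ge \ell_i + 1\right)$, Lemma 14 yields that this
probability is $O(\sigma_{\mathcal{X}}^2 n^{-1} \ell_i^{-1})
= O(\sigma_{\mathcal{X}}^2 n^{-1} e^{-\lambda i/2})$. Combining
these observations,

$$\begin{aligned}
\Pr(E_i) &\le \frac{1+\delta}{n^{\epsilon}}\, e^{-\lambda i}
+ O\left(\sigma_{\mathcal{X}}^2 \delta^{-2} n^{-(1+\epsilon)} e^{-\lambda i/2}\right)
+ O\left(\sigma_{\mathcal{X}}^2 n^{-1} e^{-\lambda i/2}\right) \\
&= \frac{1+\delta}{n^{\epsilon}}\, e^{-\lambda i}
+ O\left(\sigma_{\mathcal{X}}^2 \delta^{-2} n^{-1}\right) e^{-\lambda i/2}.
\end{aligned}$$

Finally, the closed formula for geometric series yields

$$\Pr(T > \nu) \le \sum_{i \ge 0} \Pr(E_i)
\le \frac{1+\delta}{n^{\epsilon}\,(1-e^{-\lambda})}
+ O\left(\frac{\sigma_{\mathcal{X}}^2}{\delta^2 n}\right).$$

□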


8.  CONCLUSIONS 

In this paper we presented XFetch, an effective approach
against cache stampedes based on probabilistic early expirations.
Our approach is extremely simple to implement and
requires no parameter tuning. Using an analysis based on
general stochastic request distributions, we show that our
approach is immune to high frequency of requests in terms
of reducing stampedes, and that the relationship with how
early the expirations are performed is optimal. Experimental
results on real-world and synthetic datasets demonstrate
how our approach outperforms current methods and also
show its robustness to bursts of requests.


APPENDIX 

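Proof of Lemma 14. If $N > (1+\delta)n$, it must be that
$\sum_{i=1}^{(1+\delta)n} x_i \le 1$, where the $x_i$ are i.i.d. and distributed
according to the inter-arrival distribution, that is, have mean
$\frac{1}{n}$ and standard deviation $\frac{\sigma_{\mathcal{X}}}{n}$. If
$X = \sum_{i=1}^{(1+\delta)n} x_i$ then we have
$\mu_X = E[X] = (1+\delta)$ and
$\sigma_X^2 = \mathrm{Var}(X) \le \frac{(1+\delta)\sigma_{\mathcal{X}}^2}{n}$. Using
Lemma 13,

$$\Pr(N > (1+\delta)n) \le \Pr(X \le 1) = \Pr(\mu_X - X \ge \mu_X - 1)
\le \frac{(1+\delta)\sigma_{\mathcal{X}}^2}{n(\mu_X - 1)^2}
= \frac{(1+\delta)\sigma_{\mathcal{X}}^2}{\delta^2 n}.$$

The other case is analogous. □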

Proof of Lemma 15. Fix any $\epsilon = \epsilon(n) > 0$, and let
$\nu = \frac{1+\epsilon}{\lambda}\log n$. For any $i \ge 0$, let $E_i$ be the event that
$T \in (\nu + i, \nu + i + 1]$. Then, by a union bound we have

$$\Pr(T > \nu) \le \sum_{i \ge 0} \Pr(E_i).$$


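Claim 21 Let $T = T_{\mathrm{Exp}(\lambda)}$ be the early expiration gap and
$\nu = \frac{1}{\lambda}\log n$. Then, for any integer $-\nu \le i \le \nu$ and real
$0 < \delta < 1$, we have

$$\Pr(T \in (\nu - i - 1, \nu - i)) \le
\sum_{m=0}^{2\nu} \left(\frac{2\nu\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m}
\exp\left(-(1-\delta)\,\frac{e^{\lambda(i-m)} - \frac{1}{n}}{e^{\lambda} - 1}\right).$$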


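Proof. In order to have $T \in (\nu - i - 1, \nu - i)$, it must
be that no $y$-early process with $y > \nu - i$ performs an early
expiration. Then, if $A_j$ is the event that no $y$-early process
with $\nu - (j+1) < y \le \nu - j$ performs an early expiration,
we can write

$$\Pr(T \in (\nu - i - 1, \nu - i)) \le \prod_{j=-\nu}^{i-1} \Pr(A_j).$$

Consider any $0 < \delta < 1$. To bound $\Pr(A_j)$, we use Lemma 14
to get a lower bound of $(1-\delta)n$ on the number $N_j$ of $y$-early
processes with $\nu - (j+1) < y \le \nu - j$. We then use the fact
that for each of these $(1-\delta)n$ processes the probability of
performing an early expiration is at least $e^{-\lambda(\nu - j)}$.

$$\begin{aligned}
\Pr(A_j) &= \Pr(A_j \mid N_j \ge (1-\delta)n)\,\Pr(N_j \ge (1-\delta)n) \\
&\quad + \Pr(A_j \mid N_j < (1-\delta)n)\,\Pr(N_j < (1-\delta)n) \\
&\le \Pr(A_j \mid N_j \ge (1-\delta)n) + \frac{\sigma_{\mathcal{X}}^2}{\delta^2 n} \\
&\le \left(1 - e^{-\lambda(\nu - j)}\right)^{(1-\delta)n} + \frac{\sigma_{\mathcal{X}}^2}{\delta^2 n} \\
&= \left(1 - \frac{e^{\lambda j}}{n}\right)^{(1-\delta)n} + \frac{\sigma_{\mathcal{X}}^2}{\delta^2 n} \\
&\le e^{-(1-\delta)e^{\lambda j}} + \frac{\sigma_{\mathcal{X}}^2}{\delta^2 n},
\end{aligned}$$

where we used the fact that $e^{-\lambda\nu} = \frac{1}{n}$ and the fact that
$1 - x \le e^{-x}$. The above implies that

$$\begin{aligned}
\Pr(T \in (\nu - i - 1, \nu - i))
&\le \prod_{j=-\nu}^{i-1} \left( e^{-(1-\delta)e^{\lambda j}} + \frac{\sigma_{\mathcal{X}}^2}{\delta^2 n} \right) \\
&\le \sum_{m=0}^{\nu+i} \binom{\nu+i}{m} \left(\frac{\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m}
\prod_{j=-\nu}^{i-m-1} e^{-(1-\delta)e^{\lambda j}} \\
&\le \sum_{m=0}^{\nu+i} \left(\frac{(\nu+i)\,\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m}
\prod_{j=-\nu}^{i-m-1} e^{-(1-\delta)e^{\lambda j}} \\
&\le \sum_{m=0}^{2\nu} \left(\frac{2\nu\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m}
\prod_{j=-\nu}^{i-m-1} e^{-(1-\delta)e^{\lambda j}}.
\end{aligned}$$

To conclude the claim, we use the closed formula for geometric
sums and the fact that $e^{-\lambda\nu} = n^{-1}$.

$$\prod_{j=-\nu}^{i-m-1} e^{-(1-\delta)e^{\lambda j}}
= \exp\left(-(1-\delta) \sum_{j=-\nu}^{i-m-1} e^{\lambda j}\right)
= \exp\left(-(1-\delta)\,\frac{e^{\lambda(i-m)} - e^{-\lambda\nu}}{e^{\lambda} - 1}\right).$$

□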

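Proof of Lemma 16. Let $\nu = \frac{1}{\lambda}\log n$. We can write

$$E[S] = E[S \mid T > 2\nu]\,\Pr(T > 2\nu)
+ \sum_{i=-\nu}^{\nu} E[S \mid T \in (\nu - i - 1, \nu - i)]\,\Pr(T \in (\nu - i - 1, \nu - i)).$$

The probability that a $y$-early process with $y > 2\nu$ updates
the cache is at most $\Pr(\mathrm{Exp}(\lambda) \ge 2\nu) = e^{-2\lambda\nu}$. Therefore,
conditioning on an early expiration gap $T$ greater than $2\nu$, we
have that $E[S \mid T > 2\nu] \le n e^{-2\lambda\nu} = 1/n$. Hence, the first
term is $O(n^{-1})$.

We now consider the summation in the above expression
for $E[S]$. By Claim 21 and the fact that
$E[S \mid T \in (\nu - i - 1, \nu - i)] \le n e^{-\lambda(\nu - i)} = e^{\lambda i}$, we have

$$\begin{aligned}
&\sum_{i=-\nu}^{\nu} E[S \mid T \in (\nu - i - 1, \nu - i)]\,\Pr(T \in (\nu - i - 1, \nu - i)) \\
&\quad\le \sum_{i=-\nu}^{\nu} \sum_{m=0}^{2\nu}
\left(\frac{2\nu\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m} e^{\lambda i}\,
e^{-(1-\delta)\frac{e^{\lambda(i-m)} - \frac{1}{n}}{e^{\lambda}-1}} \\
&\quad= \sum_{m=0}^{2\nu} \left(\frac{2\nu\sigma_{\mathcal{X}}^2}{\delta^2 n}\right)^{m} e^{\lambda m}
\sum_{i=-\nu}^{\nu} e^{\lambda(i-m)}\,
e^{-(1-\delta)\frac{e^{\lambda(i-m)} - \frac{1}{n}}{e^{\lambda}-1}} \\
&\quad\le e^{\frac{1-\delta}{n(e^{\lambda}-1)}}
\sum_{m=0}^{2\nu} \left(\frac{2\nu\sigma_{\mathcal{X}}^2 e^{\lambda}}{\delta^2 n}\right)^{m}
\sum_{i=-\infty}^{\infty} e^{\lambda i}\, e^{-(1-\delta)\frac{e^{\lambda i}}{e^{\lambda}-1}} \\
&\quad\le \left(1 + O\left(\frac{\sigma_{\mathcal{X}}^2 \log n}{\delta^2 n}\right)\right)
\sum_{i=-\infty}^{\infty} e^{\lambda i}\, e^{-(1-\delta)\frac{e^{\lambda i}}{e^{\lambda}-1}}.
\end{aligned}$$

We now conclude the proof by showing a bound on the series
above. Let $f(x) = e^{\lambda x} e^{-c e^{\lambda x}}$ and $x^* = \frac{1}{\lambda}\log\frac{1}{c}$, where
$c = \frac{1-\delta}{e^{\lambda}-1}$. We have that $f(x)$ is increasing for $x < x^*$, and
decreasing for $x > x^*$. Therefore, using the fact that
$F(x) = \int f(x)\,dx = -\frac{1}{c\lambda} e^{-c e^{\lambda x}}$, we can approximate the
series as follows:

$$\begin{aligned}
\sum_{i=-\infty}^{\infty} f(i)
&\le \sum_{i=-\infty}^{x^*-1} f(i) + f(x^*) + \sum_{i=x^*+1}^{\infty} f(i) \\
&\le \int_{-\infty}^{x^*} f(x)\,dx + f(x^*) + \int_{x^*}^{\infty} f(x)\,dx \\
&\le F(x^*) - F(-\infty) + f(x^*) + F(\infty) - F(x^*) \\
&= \frac{1}{c\lambda} + \frac{1}{ce} \\
&= \frac{e^{\lambda}-1}{1-\delta}\left(\frac{1}{\lambda} + \frac{1}{e}\right).
\end{aligned}$$

□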


Proof of Lemma 17. Suppose the contrary, that is, suppose
that there exists an $i$ for which $p_i > e^{-(1-e^{-1/3})\, i/\epsilon}$. Fix
$n = \lceil e^{(1-e^{-1/3})\, i/\epsilon} \rceil$, and observe that

$$\log(n-1) \le (1 - e^{-1/3})\,\frac{i}{\epsilon},$$

which implies $i \ge \frac{1}{1-e^{-1/3}} \cdot \epsilon \cdot \log(n-1)$.

By a standard Chernoff-like bound (for Poissonian variables),
we have that the probability that less than $\frac{n}{2}$ processes
end up in the interval $[i, i+1]$ is at most:

$$2e^{-n}\,\frac{n^{n/2}}{(n/2)!} = \Theta\left((2/e)^{n/2} \cdot n^{-1/2}\right).$$

The exponential distribution of the Poisson point process
is such that, if we condition on $k$ processes showing up in
that interval, the distribution of each of those $k$ processes is
uniform at random in the interval.

Therefore, under the conditioning that at least $\frac{n}{2}$ processes
end up in the interval, we have that the probability
that no $y$-early process with $i \le y \le i+1$ performs an early
expiration is at most:

$$(1 - p_i)^{n/2} \le e^{-p_i n/2} \le e^{-1/2},$$

since $(1 - x) \le e^{-x}$, for each $x \in (0, 1]$, and $p_i n \ge 1$.

Therefore, the expected early expiration gap is at least

$$\left(1 - e^{-1/2} - O\left((2/e)^{n/2} \cdot n^{-1/2}\right)\right) \cdot i > \epsilon\log(n-1),$$

a contradiction. □

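Proof of Lemma 18. Assume $n \le \lfloor e^{-1+(1-e^{-1/3})/\epsilon} \rfloor$.
By the union bound, we have that the probability that the
early expiration gap is more than 1 is at most:

$$\begin{aligned}
\Pr(T_{\mathcal{D}} > 1) &\le n \sum_{i=1}^{\infty} p_i
\le n \sum_{i=1}^{\infty} e^{-(1-e^{-1/3})\, i/\epsilon} \\
&= n \cdot \frac{e^{-(1-e^{-1/3})/\epsilon}}{1 - e^{-(1-e^{-1/3})/\epsilon}} \\
&\le n \cdot \frac{e^{-(1-e^{-1/3})/\epsilon}}{1 - e^{-1}} \le \frac{1}{e-1}.
\end{aligned}$$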


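Suppose that $p_0 > e^{-(1-e^{-1/3})/(2\epsilon)}$, and fix
$n = \lfloor e^{-1+(1-e^{-1/3})/\epsilon} \rfloor$.
Again, we have that the probability that
less than $n/2$ processes end up in an interval of length 1 is
at most $O\left((2/e)^{n/2} \cdot n^{-1/2}\right)$. The expected stampede can
then be lower bounded by:

$$\begin{aligned}
\Pr(T_{\mathcal{D}} \le 1) \cdot \frac{n\,p_0}{3}
&\ge \frac{e-2}{3e-3}\,\lfloor e^{-1+(1-e^{-1/3})/\epsilon} \rfloor\, e^{-(1-e^{-1/3})/(2\epsilon)} \\
&\ge \frac{e-2}{3e-3}\, e^{-1+(1-e^{-1/3})/(2\epsilon)} - 1 \\
&= \frac{e-2}{3e^2-3e}\, e^{(1-e^{-1/3})/(2\epsilon)} - 1 \\
&\ge e^{\Omega(1/\epsilon)}.
\end{aligned}$$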


Suppose, instead, that $p_0 \le e^{-(1-e^{-1/3})/(2\epsilon)}$. Fix
$n = \lfloor e^{-1+(1-e^{-1/3})/(2\epsilon)} / 3 \rfloor$. The probability that more than $2n$

processes  end  up  in  an  interval  of  length  1  can  be  shown  to 
be  (using  Poissonian  tail  bounds)  at  most 


We then have that:

$$\Pr(T_{\mathcal{D}} > 0) \le 3n \cdot p_0 + 3n \sum_{i=1}^{\infty} p_i
\le \frac{1}{e} + \frac{1}{e-1} \le \frac{19}{20}.$$

If $T_{\mathcal{D}} = 0$, that is, no process performs an early expiration,
then we run into a regular expiration with a stampede of
size $n$. Therefore the expected stampede is at least

$$\frac{1}{20}\, n = e^{\Omega(1/\epsilon)}.$$

The proof is concluded. □


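Proof of Lemma 19. Recall that if the process arrival
distribution is $\mathcal{X}$, then the distribution of the interval
between two consecutive processes is $\mathcal{X}/n$. Without loss of
generality, say that the expected value of $\mathcal{X}$ is 1 and that
$\sigma_{\mathcal{X}}$ is its standard deviation. Let $p = \frac{\log \tilde{n}}{\tilde{n}}$. Consider a
second setting where the rate is $p \cdot n$, with the exponential gap
distribution $\mathrm{Exp}(\lambda)$, and with a new process arrival
distribution $\mathcal{X}'$. To sample from $\mathcal{X}'$, we proceed as follows: first,
we flip a coin with head probability $p$ until we get heads; let
$k$ be the number of coin flips (observe that $k$ is distributed
like the geometric distribution $\mathrm{Geom}(p)$ with parameter $p$);
then, sample $k$ i.i.d. variables $X_1, \ldots, X_k \sim \mathrm{Exp}(\lambda)$; the
sample of $\mathcal{X}'$ will be equal to $p \cdot (X_1 + X_2 + \cdots + X_k)$.

A simple coupling shows that the original process (having
rate $n$, inter-arrival distribution $\mathcal{X}$, and gap distribution
$\mathcal{D}$) is equivalent to the new process (having rate $p \cdot n$,
inter-arrival distribution $\mathcal{X}'$, and gap distribution $\mathrm{Exp}(\lambda)$).
Therefore, after having computed the expectation and the
variance of $\mathcal{X}'$, we can apply Theorem 7 to get our claim.
We have

$$\begin{aligned}
\mu_{\mathcal{X}'} = E[\mathcal{X}'] &= p \cdot E[\mathrm{Geom}(p)] \cdot E[\mathrm{Exp}(\lambda)] = \frac{1}{\lambda}, \\
\mathrm{Var}[\mathcal{X}'] &= p^2 \cdot \left( E[\mathrm{Geom}(p)] \cdot \mathrm{Var}[\mathrm{Exp}(\lambda)]
+ E[\mathrm{Exp}(\lambda)]^2 \cdot \mathrm{Var}[\mathrm{Geom}(p)] \right) \\
&= \frac{p}{\lambda^2} + \frac{1-p}{\lambda^2} = \frac{1}{\lambda^2}.
\end{aligned}$$

Therefore, $\sigma_{\mathcal{X}'} = \lambda^{-1}$. We can then apply Lemma 15 to
get that the early expiration gap is at most $\frac{\log(pn)}{\lambda}$ with high
probability. Observe that

$$\frac{1}{\alpha}\log(n/\alpha) \le pn \le \alpha\log(\alpha n).$$

Thus, the early expiration gap is at most
$\frac{\log\log(\alpha n) + \log\alpha}{\lambda}$ with
high probability. By $\alpha \le e^{o(\log\log n)}$, we get that the
expiration gap is at most $(1 + o(1)) \cdot \frac{\log\log n}{\lambda}$.

We can then use Lemma 16 to conclude that the expected
stampede is at most:

$$\begin{aligned}
&\left(1 + O\left(\frac{\sigma_{\mathcal{X}'}^2 \log(pn)}{\delta^2 pn}\right)\right)
\frac{e^{\lambda}-1}{1-\delta}\left(\frac{1}{\lambda}+\frac{1}{e}\right) \\
&\quad= \left(1 + O\left(\frac{\lambda^{-2}\log(\alpha\log(\alpha n))}{\delta^2 \log(n/\alpha)}\right)\right)
\frac{e^{\lambda}-1}{1-\delta}\left(\frac{1}{\lambda}+\frac{1}{e}\right).
\end{aligned}$$

By $\alpha \le e^{o(\log\log n)}$, if we let $\delta$ shrink to 0, we get that the
expected stampede is at most

$$(1 + o(1)) \cdot \frac{e^{\lambda}-1}{1-\delta}\left(\frac{1}{\lambda}+\frac{1}{e}\right).$$

□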


Proof of Lemma 20. We assume that the rate is known.
Let $n_k = 1$, and, for $i \ge 1$, $n_{i-1} = e^{n_i}$.

Fix some $k \ge 1$, and let the rate $n$ be equal to $n = n_0$.
Observe that $k = \log^* n$.

Now, given a time $-t$, fix:

$$p_t = \begin{cases}
n_{\lfloor t/2 \rfloor} \cdot n^{-1} & \text{if } \lfloor t \rfloor < 2k+1 \text{ and } \lfloor t \rfloor \text{ even}, \\
0 & \text{otherwise.}
\end{cases}$$

That is, for a time in $(-\infty, -2k-1]$, the probability is
0; for a time in $(-2k-1, -2k]$, the probability is $n^{-1}$; for
a time in $(-2k, -2k+1]$, the probability is 0; for a time in
$(-2k+1, -2k+2]$, it is $e\,n^{-1}$; for a time in $(-2k+2, -2k+3]$,
it is 0; for a time in $(-2k+3, -2k+4]$, it is $e^{e} n^{-1}$, ..., for
a time in $(-2, -1]$ it is 1, and for a time in $(-1, 0]$ it is 0.

If the process passes time $-2i-1$, $i = 0, \ldots, k$, without
having ever refreshed then, in expectation, it will make $n_i$
refreshes before reaching time $-2i+1$. The probability $P_i$
of making no refresh in the interval $(-2i-1, -2i+1]$, if the
process has not refreshed before time $-2i-1$, is equal to:

$$P_i = (1 - p_i)^n = \left(1 - \frac{n_i}{n}\right)^{n} \to e^{-n_i} = n_{i-1}^{-1}.$$

Observe that $P_{i+1} \cdot p_i \cdot n = P_{i+1} \cdot n_i = 1$. The expected
number of refreshes in the interval $(-i-1, -i]$ is then equal
to:

$$P_k \cdot P_{k-1} \cdots P_{i+1} \cdot p_i \cdot n = \prod_{j=i+2}^{k} P_j.$$

The algorithm will refresh with probability 1 in the interval
$(-2, -1]$. Therefore the expected stampede is upper
bounded by:

$$2 + \sum_{i=1}^{k-2} \prod_{j=i+2}^{k} P_j \le 3.$$

The early expiration gap, on the other hand, is at most
equal to $2k+1$, with $k = \log^* n$, since the probability of a
refresh is 0 if $t < -2k-1$. □